Plant disease is a challenge in the agricultural sector, especially for rice
production. Identifying diseases in rice leaves is the first step toward eradicating
and treating them to reduce crop failure. With the rapid development of the
convolutional neural network (CNN), rice leaf disease can be recognized well
without the help of an expert. This research evaluates the performance of several
CNN architectures on the classification of rice leaf disease images, using 5932
images divided into 4 disease classes. The ratio of training, validation, and
testing data is 60:20:20. Adam optimization with a learning rate of 0.0009 and
softmax activation were used in this study. In the experiments, the InceptionV3
and InceptionResnetV2 architectures achieved the best accuracy, 100%, followed by
ResNet50 and DenseNet201 with 99.83%, MobileNet with 99.33%, and EfficientNetB3
with 90.14%.
1. INTRODUCTION

Rice is a staple food commodity for most people in the world. In Indonesia, for example, rice
production in 2019 amounted to 54.60 million tonnes of milled dry unhulled rice with a harvest area of 10.68
million hectares [1]. However, rice plants are often exposed to disease during their growing period.
Disease in rice frequently attacks the leaves. Fungi and bacteria are believed to be the main causes of the
disease [2]. Some of the diseases that often attack rice leaves are Brown Spot, Tungro, Bacterial Leaf Blight,
and Blast. Rice leaf disease can lead to poor plant growth, resulting in decreased production.

Previous studies estimate that the decrease in rice production can reach 100% due to blast disease
[3], 15-25% due to bacterial leaf blight [4], and 40% due to brown leaf spot [5]. If not handled seriously,
these diseases will bring huge economic losses for rice farmers. Many farmers do not know enough about rice
leaf diseases, especially young farmers who lack expertise in agriculture, so it is difficult for them to
identify the type of disease. Without knowing the type of disease, it is difficult to choose suitable
treatments and appropriate handling procedures. Therefore, it is necessary to be able to classify the types
of diseases in rice leaves.

In recent years, many studies have been conducted on the classification of rice leaf disease.
One study used the k-nearest neighbor (KNN) method to classify Blast and Brown Spot disease in rice
leaves and obtained an accuracy of 76.59% [6]. Another study detected leaf smut, bacterial leaf blight, and
brown spot on rice leaves, obtaining accuracies of 70.83% with logistic regression, 91.67% with KNN,
97.91% with a decision tree (DT), and 50% with naive Bayes (NB) [7]. Further research used the support
vector machine (SVM) method to diagnose Brown Spot and Blast disease in rice leaves with a dataset of 120
samples, obtaining an accuracy of 95.5% [8]. SVM has also been combined with the k-means clustering method
for classification, obtaining an accuracy of 92.06% [9]. The artificial neural network (ANN) method has
been used for the classification of rice leaf disease by applying segmentation and feature extraction to
images [10].

However, with these methods, achieving good accuracy still depends on the feature selection
technique. Recent research on convolutional neural networks (CNN) has contributed greatly to image-based
identification by eliminating the need for image pre-processing and by providing built-in feature selection
[11]. The CNN method has been used [12] to identify 10 disease classes in rice leaves and stems using 500
images, obtaining an accuracy of 95.48% with 10-fold cross-validation. Other studies used CNN architectural
models such as VGG16 with 92.46% accuracy [11], AlexNet with 91.23% accuracy [13], and VGG19 with 92%
accuracy [14]. In recent years, many researchers have been developing new CNN architectures, so it is
necessary to study the performance of these newer architectures for the classification of rice leaf disease.

This paper will conduct a study that focuses on the classification of 4 types of rice leaf diseases, 
namely Bacterial Leaf Blight, Blast, Tungro, and Brown Spot using 6 types of CNN architectures, namely 
InceptionV3, ResNet50, InceptionResnetV2, DenseNet201, MobileNet, and EfficientNetB3. By comparing 
all the CNN architectures, performance evaluation will be carried out to see the best way to recognize rice 
leaf disease. 


2. RESEARCH METHOD 

This section explains the research flow regarding the classification of rice leaf disease starting from 
the acquisition of datasets, pre-processing data, the use of the convolutional neural network (CNN) 
architectural model, and evaluation of the performance of the CNN architectural model as shown in Figure 1. 

Figure 1. Research flow


2.1. Dataset 

The dataset used in this study was taken from Mendeley Data and is a collection of images of
rice leaf disease [15]. It contains 5932 rice leaf disease images in 4 disease classes: Bacterial Blight,
Blast, Brown Spot, and Tungro, as shown in Figure 2. All images used in this study are stored in JPG
format.


Figure 2. Types of disease on rice leaves; (a) Blast, (b) Bacterial Blight, (c) Tungro, (d) Brown Spot


Int J Artif Intell, Vol. 10, No. 4, December 2021: 1069 - 1078 


Int J Artif Intell ISSN: 2252-8938 1071 


2.2. Pre-processing data 

The dataset is divided into 60% training, 20% validation, and 20% testing data. Each image is then
resized to 300x300 pixels. The training data are augmented by rotating images by up to 40 degrees, shifting
them by a factor of up to 0.2, zooming by a factor of up to 0.2, and flipping them vertically and
horizontally. As a result, the number of training images, including the augmented images, increases
sixfold. Figure 3 shows sample data from the augmentation results. The disease names and the amount of
data used in this study are listed in Table 1.
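
The augmentation settings above can be sketched in a few lines of plain Python. This is only an illustration of how one random set of transformation parameters might be drawn and how the sixfold training count arises; the constant and function names are hypothetical, and reading "6 times" as the original image plus five augmented variants is an assumption rather than something the text states.

```python
import random

# Ranges taken from the text; the names themselves are hypothetical.
ROTATION_MAX_DEG = 40
SHIFT_MAX = 0.2
ZOOM_MAX = 0.2
COPIES_PER_IMAGE = 6  # assumed: the original image plus five augmented variants


def sample_augmentation(rng):
    """Draw one random set of augmentation parameters for a single image."""
    return {
        "rotation_deg": rng.uniform(-ROTATION_MAX_DEG, ROTATION_MAX_DEG),
        "shift_x": rng.uniform(-SHIFT_MAX, SHIFT_MAX),
        "shift_y": rng.uniform(-SHIFT_MAX, SHIFT_MAX),
        "zoom": rng.uniform(1 - ZOOM_MAX, 1 + ZOOM_MAX),
        "flip_horizontal": rng.random() < 0.5,
        "flip_vertical": rng.random() < 0.5,
    }


def augmented_count(n_training_images, copies=COPIES_PER_IMAGE):
    """Number of training images once every image has `copies` versions."""
    return n_training_images * copies


rng = random.Random(0)  # fixed seed so the sketch is reproducible
params = sample_augmentation(rng)
total_after_augmentation = augmented_count(4745)
```

With the 4745 training images reported in Table 1, `augmented_count` gives the 28470 images listed there.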


Figure 3. Augmentation results from the image of blast disease 


Table 1. Details of rice leaf disease dataset 


Leaf diseases      Original image   Training   Training + augmentation   Validation (20%)   Testing (20%)
Bacterial Blight   1584             1267       7602                      317                317
Blast              1440             1152       6912                      288                288
Brown Spot         1600             1280       7680                      320                320
Tungro             1308             1046       6276                      262                262
Total              5932             4745       28470                     1187               1187


2.3. CNN model

Convolutional neural networks (CNNs) are the most popular deep learning algorithms for 
researchers to test [16]. CNN was first introduced in 1988 by Yann LeCun [17]. The introduction of CNN has 
changed the way problems are solved in the classification of an image [18]. In this study, six types of CNN 
architecture will be used to conduct experiments in classifying diseases in rice leaves, including InceptionV3 
[19], ResNet50 [20], InceptionResnetV2 [21], DenseNet201 [22], MobileNet [23], and EfficientNetB3 [24]. 
The following will briefly explain the CNN architecture used in this study. 


2.3.1. InceptionV3 

InceptionV3 is a CNN architecture developed by Google for the ImageNet Large Scale Visual
Recognition Challenge (ILSVRC) in 2012 [19]. InceptionV3 was developed to factorize convolutions [19]. This
means that each n x n convolution can be replaced by a 1 x n convolution followed by an n x 1 convolution,
which reduces the number of parameters, mitigates overfitting, and strengthens the nonlinear expressive
ability of the network [25].
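
A small calculation shows why this replacement saves parameters. The sketch below assumes the factorization meant here is the replacement of an n x n convolution with a 1 x n convolution followed by an n x 1 convolution; the kernel size, channel count, and function names are hypothetical values chosen only for illustration.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer, ignoring biases."""
    return kernel_h * kernel_w * in_channels * out_channels


def factorized_weights(n, channels):
    """Weights of a 1 x n convolution followed by an n x 1 convolution."""
    return (conv_weights(1, n, channels, channels)
            + conv_weights(n, 1, channels, channels))


n, channels = 3, 64  # hypothetical kernel size and channel count
full = conv_weights(n, n, channels, channels)
factorized = factorized_weights(n, channels)
saving = 1 - factorized / full  # fraction of weights removed
```

For n = 3 the factorized pair needs 2n instead of n squared weights per channel pair, so one third of the weights are removed; the saving grows with larger n.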


2.3.2. ResNet50 

ResNet50 is a CNN architecture that introduces the concept of shortcut connections [20]. Shortcut
connections address the vanishing gradient problem that arises when a network is made deeper. Improving
performance by deepening a network cannot be done simply by stacking layers: the deeper the network, the
more likely the gradients become very small, which decreases performance or accuracy [20].
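
The idea of a shortcut connection can be illustrated without any deep learning library. The sketch below is a simplified illustration on plain lists of numbers, not the actual ResNet50 block: the output is the transformed input plus the input itself, and `small_transform` is a hypothetical stand-in for the stacked convolution layers.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise for a list of floats."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("the shortcut requires matching shapes")
    return [f + xi for f, xi in zip(fx, x)]


def small_transform(x):
    # Hypothetical stand-in for the stacked layers F(x).
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
y = residual_block(x, small_transform)

# If the transform outputs zeros, the block reduces to the identity,
# so the input still passes through the shortcut unchanged.
identity_out = residual_block(x, lambda v: [0.0] * len(v))
```

Because the input is always added back, the signal and its gradient have a direct path around the stacked layers, which is the property the shortcut is meant to provide.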


2.3.3. InceptionResnetV2

InceptionResNetV2 was first introduced by Szegedy et al. [21]. InceptionResNetV2 architecture is a 
compound of the Inception structure and residual module. The convolution filter is combined with the 
residual connection which aims to avoid the problems caused by the deeper structure. Residual connection 
can reduce the time during the training process [26]. InceptionResnetV1 and InceptionResnetV2 have the 
same overall structure, but different modules in the network. 


2.3.4. DenseNet201

DenseNet is a CNN architecture that introduces short connections that connect each layer directly to
the other layers in a feed-forward manner. DenseNet uses narrow layers, each of which adds only a small set
of feature maps to the feature maps shared across the network [22].


2.3.5. MobileNet 

MobileNet was first introduced by researchers from Google [23] to reduce the need for large
computing resources, so that the architecture can be used on mobile phones. MobileNet uses convolution
layers whose filter depth matches the depth of the input. The MobileNet architecture splits convolution
into 2 types, namely depthwise convolution and pointwise convolution.
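
The saving from splitting a convolution into depthwise and pointwise parts can be checked with simple arithmetic. The sketch below counts weights (ignoring biases) for one layer; the kernel size and channel counts are hypothetical values chosen for illustration, not the actual MobileNet layer sizes.

```python
def standard_conv_weights(k, c_in, c_out):
    """Weights of a standard k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_weights(k, c_in, c_out):
    """Weights of a depthwise k x k convolution plus a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


k, c_in, c_out = 3, 32, 64  # hypothetical layer dimensions
standard = standard_conv_weights(k, c_in, c_out)
separable = depthwise_separable_weights(k, c_in, c_out)
ratio = separable / standard  # equals 1 / c_out + 1 / k**2
```

For these values the separable layer needs 2336 weights instead of 18432, roughly one eighth of the standard layer.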


2.3.6. EfficientNetB3

EfficientNet was first proposed by Tan and Le in 2019 [24] and is an architecture used to optimize
classification networks. In general, most networks are scaled along 3 dimensions: widening the network,
deepening the network, and increasing the input resolution. EfficientNet therefore applies a combined
scaling model that jointly optimizes the network width, depth, and resolution to improve accuracy [25].
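
The combined scaling idea can be illustrated numerically. The sketch below assumes the base coefficients reported by Tan and Le for the EfficientNet family (1.2 for depth, 1.1 for width, 1.15 for resolution); the function name and the chosen compound coefficient are hypothetical, and the exact settings of EfficientNetB3 are not derived here.

```python
# Base coefficients reported for the EfficientNet family (assumed here).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi):
    """Scale depth, width, and resolution together with one coefficient phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # Cost grows linearly with depth and quadratically with width and resolution.
    relative_cost = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, relative_cost


depth, width, resolution, relative_cost = compound_scale(1)
```

With a compound coefficient of 1, the relative cost is about 1.92, that is, each unit step of the coefficient roughly doubles the computation.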

The models used in this experiment were pre-trained on the ImageNet dataset. By default, all
pre-trained CNN models have a fully connected (FC) output layer with 1000 nodes. This output layer is
replaced with a layer of 4 nodes, matching the number of rice leaf disease classes, followed by softmax
activation.
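
The role of the 4-node softmax output can be shown with a short pure-Python sketch. It is only an illustration of the activation, not the training code used in the study; the logits and the order of the class names are hypothetical.

```python
import math

# The class order is an assumption made for this illustration.
CLASS_NAMES = ["Bacterial Blight", "Blast", "Brown Spot", "Tungro"]


def softmax(logits):
    """Convert raw output scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 0.5, -1.0, 0.1]  # hypothetical outputs of the 4-node FC layer
probabilities = softmax(logits)
predicted_class = CLASS_NAMES[probabilities.index(max(probabilities))]
```

The predicted class is the one with the highest probability, here the first class because it has the largest logit.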


2.4. Evaluation of CNN architecture model 

The performance of each CNN architecture is evaluated and compared in two stages. First, accuracy,
loss, and computation time are calculated for each architecture on the training and validation data. Then,
each trained model is evaluated on the testing data using the confusion matrix, from which accuracy,
precision, recall, and F1 score are calculated. The multiclass confusion matrix used in this study is shown
in Table 2.


Table 2. Confusion matrix for multiclass

                                 Predicted Number
                           Class 1   Class 2   ...   Class n
Actual Number   Class 1    x_{11}    x_{12}    ...   x_{1n}
                Class 2    x_{21}    x_{22}    ...   x_{2n}
                ...        ...       ...       ...   ...
                Class n    x_{n1}    x_{n2}    ...   x_{nn}


From Table 2, we obtain the total number of true positives (TTP) over all classes, and the true
negatives (TTN), false positives (TFP), and false negatives (TFN) for each class i, calculated using
(1)-(4) [27].


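$$TTP_{all} = \sum_{j=1}^{n} x_{jj} \tag{1}$$

$$TTN_{i} = \sum_{\substack{j=1 \\ j \neq i}}^{n} \sum_{\substack{k=1 \\ k \neq i}}^{n} x_{jk} \tag{2}$$

$$TFP_{i} = \sum_{\substack{j=1 \\ j \neq i}}^{n} x_{ji} \tag{3}$$

$$TFN_{i} = \sum_{\substack{j=1 \\ j \neq i}}^{n} x_{ij} \tag{4}$$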


The TTP, TTN, TFP, and TFN results are then used to calculate the overall accuracy (OA) and the
precision (P), recall (R), and F-measure score (F1) of each class i, using (5)-(8) [27], [28].


$$OA = \frac{TTP_{all}}{\text{Total Number of Entries}} \tag{5}$$

$$P_{i} = \frac{TTP_{all}}{TTP_{all} + TFP_{i}} \tag{6}$$

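
The counts in (1)-(4) and the overall accuracy can be computed directly from a confusion matrix. The sketch below is a minimal pure-Python illustration on a hypothetical 3-class matrix; the function names are assumptions, and the per-class precision, recall, and F1 follow the conventional definitions based on the diagonal entry of class i, which may differ in detail from the formulation in [27].

```python
def per_class_counts(matrix, i):
    """TP, TN, FP, FN for class i; rows are actual classes, columns are predictions."""
    n = len(matrix)
    tp = matrix[i][i]
    fp = sum(matrix[j][i] for j in range(n) if j != i)  # column i, other rows
    fn = sum(matrix[i][j] for j in range(n) if j != i)  # row i, other columns
    tn = sum(matrix[j][k] for j in range(n) for k in range(n)
             if j != i and k != i)
    return tp, tn, fp, fn


def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of entries."""
    correct = sum(matrix[j][j] for j in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def precision_recall_f1(matrix, i):
    """Conventional per-class precision, recall, and F1 score."""
    tp, _, fp, fn = per_class_counts(matrix, i)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical 3-class confusion matrix, not taken from the experiments.
matrix = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
counts_class_0 = per_class_counts(matrix, 0)
accuracy = overall_accuracy(matrix)
precision_0, recall_0, f1_0 = precision_recall_f1(matrix, 0)
```

For class 0 of this toy matrix, 8 samples are correctly classified, 2 samples of other classes are predicted as class 0, and 2 samples of class 0 are predicted as other classes.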

In this study, the experiments used a batch size of 32 for training and a batch size of 1 for
testing. The data were shuffled during training. The Adam optimizer [29] with a learning rate of 0.0009 was
used to minimize the loss function of the CNN model during training. All experiments were run with
TensorFlow on Google Colaboratory, a cloud computing platform that provided 12 GB of RAM and an Nvidia K80
GPU.
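
To make the optimizer setting concrete, the sketch below applies Adam updates to a single scalar parameter with the learning rate of 0.0009 used in this study. It is a pure-Python illustration of the update rule, not the TensorFlow code of the experiments; the decay rates and epsilon are the commonly cited defaults and are assumptions here, as is the toy loss function.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.0009,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step index."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy problem (hypothetical): minimize f(theta) = theta**2, whose gradient is 2 * theta.
first_theta, _, _ = adam_step(1.0, 2.0, 0.0, 0.0, 1)

theta, m, v = 1.0, 0.0, 0.0
for step in range(1, 101):
    theta, m, v = adam_step(theta, 2 * theta, m, v, step)
```

On the first step the bias-corrected update has a magnitude close to the learning rate itself, so the parameter moves from 1.0 to roughly 0.9991.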


3. RESULTS AND DISCUSSION 

This section describes the results of each experiment conducted. Experiments were carried out on
the training and validation data using the InceptionV3, ResNet50, InceptionResnetV2, DenseNet201,
MobileNet, and EfficientNetB3 architectural models. These experiments aim to determine the accuracy, loss,
and computation time of each architectural model during training, using 50 training epochs. The resulting
accuracy, loss, and training computation time of each CNN architecture are listed in Table 3.


Table 3. Accuracy, loss, and computation time on the training and validation data for each CNN
architecture after 50 training epochs

CNN architecture     Train Acc (%)   Val Acc (%)   Train Loss   Val Loss   Time (minutes)
InceptionV3          99.34           99.92         0.0215       0.0039     77
ResNet50             98.65           99.75         0.0408       0.0104     86
InceptionResnetV2    99.61           99.83         0.0142       0.0043     117
DenseNet201          99.12           99.66         0.0242       0.0111     98
MobileNet            97.84           99.24         0.0573       0.0238     72
EfficientNetB3       85.48           89.81         0.5387       0.3465     112


In these experiments, InceptionResnetV2 achieved the best training accuracy of all the CNN
architectural models, 99.61%, followed by InceptionV3 with 99.34%, DenseNet201 with 99.12%, ResNet50 with
98.65%, MobileNet with 97.84%, and EfficientNetB3 with 85.48%. Although InceptionResnetV2 achieved the
best training accuracy, it also required the longest computation time to complete 50 training epochs, 117
minutes. MobileNet had the shortest computation time for 50 epochs, 72 minutes, followed by InceptionV3
with 77 minutes, ResNet50 with 86 minutes, DenseNet201 with 98 minutes, and EfficientNetB3 with 112
minutes.

Table 3 also displays the training loss of each trained CNN architectural model. The architecture that 
has the lowest training loss is InceptionResnetV2 with a value of 0.0142, followed by InceptionV3 with a 
value of 0.0215, DenseNet201 with a value of 0.0242, ResNet50 with a value of 0.0408, MobileNet with a 
value of 0.0573, and EfficientNetB3 with a value of 0.5387. The accuracy and loss charts of the training
process can be seen in Figures 4-9.

Figure 4. Training accuracy and loss of InceptionV3

Figure 5. Training accuracy and loss of ResNet50

Figure 7. Training accuracy and loss of DenseNet201

Figure 9. Training accuracy and loss of EfficientNetB3


Table 4 presents the evaluation metrics of the trained models on the testing data. In the test
experiments, InceptionResnetV2 and InceptionV3 achieved the best results of all the CNN architectural
models tested, with 100% accuracy, precision, recall, and F1 score. They were followed by the ResNet50 and
DenseNet201 architectures, each with 99.83% accuracy, precision, recall, and F1 score. MobileNet achieved
99.33% accuracy, 99.36% precision, 99.32% recall, and 99.34% F1 score, and EfficientNetB3 achieved 90.14%
accuracy, 90.46% precision, 90.23% recall, and 90.28% F1 score.


Table 4. Evaluation of the trained models on the testing data

CNN architecture     Test Acc (%)   Precision (%)   Recall (%)   F1 score (%)
InceptionV3          100            100             100          100
ResNet50             99.83          99.83           99.83        99.83
InceptionResnetV2    100            100             100          100
DenseNet201          99.83          99.83           99.83        99.83
MobileNet            99.33          99.36           99.32        99.34
EfficientNetB3       90.14          90.46           90.23        90.28


Figure 10 shows the confusion matrix of the InceptionV3 architecture model on the testing data. Of
the 1187 samples tested, none were misclassified. The accuracy, precision, recall, and F1 score values can
be seen in Table 4.

Figure 11 shows the confusion matrix of the ResNet50 architecture model on the testing data. Of the
1187 samples tested, 2 were misclassified: 1 sample of the brown spot class and 1 sample of the tungro
class. The accuracy, precision, recall, and F1 score values can be seen in Table 4.

Figure 12 shows the confusion matrix of the InceptionResnetV2 architectural model on the testing
data. Of the 1187 samples tested, none were misclassified. The accuracy, precision, recall, and F1 score
values can be seen in Table 4.

Figure 13 shows the confusion matrix of the DenseNet201 architectural model on the testing data. Of
the 1187 samples tested, 2 samples of the brown spot class were misclassified. The accuracy, precision,
recall, and F1 score values can be seen in Table 4.

Figure 14 shows the confusion matrix of the MobileNet architecture model on the testing data. Of
the 1187 samples tested, 8 were misclassified: 2 samples of the bacterial blight class and 6 samples of the
blast class. The accuracy, precision, recall, and F1 score values can be seen in Table 4.

Figure 15 shows the confusion matrix of the EfficientNetB3 architecture model on the testing data.
Of the 1187 samples tested, 117 were misclassified: 36 samples of the bacterial blight class, 57 samples of
the blast class, 18 samples of the brown spot class, and 6 samples of the tungro class. The accuracy,
precision, recall, and F1 score values can be seen in Table 4.
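
The test accuracies in Table 4 follow directly from the misclassification counts reported above. The short check below recomputes them from the 1187 test samples; the dictionary and function names are hypothetical, and the counts are the ones stated in the text.

```python
TEST_SAMPLES = 1187

# Misclassified test samples per architecture, as reported in the text.
MISCLASSIFIED = {
    "InceptionV3": 0,
    "ResNet50": 2,
    "InceptionResnetV2": 0,
    "DenseNet201": 2,
    "MobileNet": 8,
    "EfficientNetB3": 117,
}


def accuracy_percent(errors, total=TEST_SAMPLES):
    """Share of correctly classified samples, in percent, rounded to 2 decimals."""
    return round(100 * (total - errors) / total, 2)


accuracies = {name: accuracy_percent(errors)
              for name, errors in MISCLASSIFIED.items()}
```

For example, ResNet50 classifies 1185 of 1187 samples correctly, which rounds to 99.83%, and EfficientNetB3 classifies 1070 of 1187 correctly, which rounds to 90.14%.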

Furthermore, the results of this experiment were compared with several previous studies. Using
the CNN architectures, this experiment obtained better performance for classifying diseases in rice leaves
than conventional methods such as KNN [6], logistic regression, decision tree (DT), naive Bayes (NB) [7],
SVM [8], and ANN [10]. The experiments in this study also performed better than other CNN architectures,
such as VGG16 [11], AlexNet [13], and VGG19 [14].


4. CONCLUSION 

After evaluating the experiments on each CNN architectural model, the best architectures for the
classification of rice leaf disease are InceptionV3 and InceptionResnetV2, with an accuracy of 100%,
followed by ResNet50 and DenseNet201 with 99.83%, MobileNet with 99.33%, and EfficientNetB3 with 90.14%.
The experiments were carried out using the Adam optimizer and by adjusting the batch size and learning
rate. The results show that the proposed CNN models outperform the conventional methods and other CNN
architectures reported in previous studies on the classification of rice leaf disease. For further
research, it is important to develop a CNN model with better training time and accuracy, and to add more
data on the types of diseases and pests found on rice leaves.